Hough Transform¶

The Hough transform can detect a variety of shapes in images; it is not limited to lines and circles. These shapes include linear objects such as roads and pens, and circular objects such as street signs or apples in a fruit-packaging factory. In this notebook, I demonstrate how the Hough transform can discern structural features within images.

RULES: As usual, OpenCV is banned in this repository. However, we may use image-processing libraries for edge detection, e.g. the Canny operator. IMPORTANTLY, WE ARE PROHIBITED FROM USING ANY HOUGH TRANSFORM TOOLS. This entails creating custom accumulator-array data structures and implementing the voting and peak-identification code to detect lines and circles.

In [ ]:
import numpy as np
import matplotlib.pyplot as plt
from skimage import feature, filters
import hough_transform_utils
import image_processing_utils
import itertools

gray_img_path = "./input/ps1-input0.png"
gray_noise_img_path = "./input/ps1-input0-noise.png"
easy_img_path = "./input/ps1-input1.png"
middle_img_path = "./input/ps1-input2.png"
hard_img_path = "./input/ps1-input3.png"

Canny Edge Detection¶

We begin with a simple artificial grayscale image. Our objective is to detect its six straight lines using the Canny edge detector and the Hough transform. To add a touch of realism, we use two versions of the image, one without noise and one with noise. This will show how parameter settings interact with noise and guide fine-tuning for real-world data.

Left: grayscale image. Right: grayscale image with noise.

The code below first uses Otsu's method to determine an optimal threshold and then applies the Canny edge detector to the grayscale images. A larger $\sigma$ detects larger-scale edges, while a smaller $\sigma$ captures finer features. Selecting an ideal $\sigma$ can be challenging; here, I believe $\sigma = 1.5$ is a suitable choice because it captures fine structure without overwhelming the image with excessive detail.

In [ ]:
gray_img = image_processing_utils.read_image(gray_img_path)  # m x n
gray_noise_img = image_processing_utils.read_image(gray_noise_img_path)  # m x n

# Use Otsu's method to find the optimal threshold.
optimal_thresh = filters.threshold_otsu(gray_img)
low_thresh = 0.5 * optimal_thresh

optimal_gray_thresh = filters.threshold_otsu(gray_noise_img)
low_gray_thresh = 0.5 * optimal_gray_thresh

edges1 = feature.canny(
    gray_img, sigma=1.0, low_threshold=low_thresh, high_threshold=optimal_thresh
)
edges2 = feature.canny(
    gray_img, sigma=1.5, low_threshold=low_thresh, high_threshold=optimal_thresh
)
edges3 = feature.canny(
    gray_img, sigma=2.0, low_threshold=low_thresh, high_threshold=optimal_thresh
)
gray_edges1 = feature.canny(
    gray_noise_img,
    sigma=1.0,
    low_threshold=low_gray_thresh,
    high_threshold=optimal_gray_thresh,
)
gray_edges2 = feature.canny(
    gray_noise_img,
    sigma=1.5,
    low_threshold=low_gray_thresh,
    high_threshold=optimal_gray_thresh,
)
gray_edges3 = feature.canny(
    gray_noise_img,
    sigma=2.0,
    low_threshold=low_gray_thresh,
    high_threshold=optimal_gray_thresh,
)

# display results
fig, ax = plt.subplots(nrows=2, ncols=4, figsize=(8, 4))
fig.suptitle("Applying Canny edge detector to grayscale images.", fontsize=12)
ax[0, 0].imshow(gray_img, cmap="gray")
ax[0, 0].set_title("Grayscale image", fontsize=12)
ax[0, 1].imshow(edges1, cmap="gray")
ax[0, 1].set_title(r"$\sigma=1$", fontsize=12)
ax[0, 2].imshow(edges2, cmap="gray")
ax[0, 2].set_title(r"$\sigma=1.5$", fontsize=12)
ax[0, 3].imshow(edges3, cmap="gray")
ax[0, 3].set_title(r"$\sigma=2$", fontsize=12)

ax[1, 0].imshow(gray_noise_img, cmap="gray")
ax[1, 0].set_title("Noisy image", fontsize=12)
ax[1, 1].imshow(gray_edges1, cmap="gray")
ax[1, 1].set_title(r"$\sigma=1$", fontsize=12)
ax[1, 2].imshow(gray_edges2, cmap="gray")
ax[1, 2].set_title(r"$\sigma=1.5$", fontsize=12)
ax[1, 3].imshow(gray_edges3, cmap="gray")
ax[1, 3].set_title(r"$\sigma=2$", fontsize=12)
for row in ax:
    for subplot in row:
        subplot.axis("off")
fig.tight_layout()
plt.show()

Hough Line Transform¶

The simplest case of the Hough transform is detecting straight lines. In general, the straight line $y = mx + b$ can be represented as a point $(b, m)$ in parameter space. However, vertical lines present an issue: their slope is infinite, so the slope parameter is unbounded.
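The standard remedy is the polar (normal) parameterization, which keeps both parameters bounded for every orientation, including vertical lines:

$$ \rho = x\cos\theta + y\sin\theta, \qquad \theta \in [-90^\circ, 90^\circ), \qquad |\rho| \le \sqrt{W^2 + H^2} $$

where $(\rho, \theta)$ are the length and orientation of the normal from the origin to the line, and $W$, $H$ are the image dimensions. A vertical line is simply $\theta = 0$ with $\rho$ equal to its x-intercept.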

Hough Accumulator Array for Lines¶

Now we want to implement a Hough transform method for finding lines. Note that the coordinate system is as pictured below, with the origin placed one pixel above and to the left of the upper-left pixel of the image and the Y-axis pointing downwards.

Thus, the pixel at $img(r,c)$ corresponds to the $(x, y)$ coordinates $(c+1, r+1)$ (zero-based array indices, origin one pixel above and to the left of the image). This pixel votes for every line parameter pair $(\rho, \theta)$ satisfying $\rho = x\cos\theta + y\sin\theta$ as $\theta$ sweeps over the discretized range.

This has the effect of making the positive angular direction clockwise instead of counter-clockwise in the usual convention. $\theta = 0$ still points in the direction of the positive X-axis.

We will implement a hough_lines_acc function that computes the Hough transform for lines and produces an accumulator array. Our code should conform to the specification of MATLAB's hough function: http://www.mathworks.com/help/images/ref/hough.html

In [ ]:
def hough_lines_acc(img, rho_res=1, theta_res=1):
    num_rows, num_cols = img.shape[0], img.shape[1]
    rho_max = int(np.sqrt(num_rows**2 + num_cols**2)) + 1
    rhos = np.arange(-rho_max, rho_max, rho_res)

    max_theta = 90
    min_theta = -90
    num_thetas = int((max_theta - min_theta) / theta_res)
    thetas = np.linspace(min_theta, max_theta, num_thetas, endpoint=False)
    thetas = thetas * np.pi / 180  # deg to rad

    # Use a wide integer type: vote counts can easily exceed 255.
    hough_matrix = np.zeros((len(rhos), len(thetas)), dtype=np.uint64)

    # `img` is a 2D binary array full of `True` and `False`.
    edge_pixels = np.argwhere(img > 0)

    for y, x in edge_pixels:
        # Convert pixel indices to the image coordinate system (origin one
        # pixel above and to the left of the upper-left pixel).
        x_coord = x + 1
        y_coord = y + 1
        for idx_theta, theta in enumerate(thetas):
            rho = x_coord * np.cos(theta) + y_coord * np.sin(theta)
            rho_idx = int((rho + rho_max) / rho_res)
            hough_matrix[rho_idx, idx_theta] += 1

    # Rescale votes to [0, 255] so that images with short and long lines can
    # share one peak threshold (see the discussion below).
    hough_matrix = image_processing_utils.rescale_image(
        hough_matrix, (0, 255), normalized=True
    )
    hough_matrix = hough_matrix.astype(np.uint8)
    return hough_matrix, thetas, rhos
In [ ]:
gray_img = image_processing_utils.read_image(gray_img_path)
edge_gray = feature.canny(gray_img, sigma=1)
gray_hough_acc, gray_thetas, gray_rhos = hough_lines_acc(
    edge_gray, rho_res=1, theta_res=1
)

easy_img = image_processing_utils.read_image(easy_img_path)
edge_easy = feature.canny(easy_img, sigma=3)
easy_hough_acc, easy_thetas, easy_rhos = hough_lines_acc(
    edge_easy, rho_res=1, theta_res=1
)

image_data = [
    (gray_img, edge_gray, gray_hough_acc, 1, gray_thetas, gray_rhos),
    (easy_img, edge_easy, easy_hough_acc, 3, easy_thetas, easy_rhos),
]

for image, edges, hough_matrix, canny_sigma, thetas, rhos in image_data:
    # Create a Matplotlib figure and a 2x2 grid of subplots
    fig = plt.figure(figsize=(6, 6))
    ax1 = plt.subplot2grid((2, 2), (0, 0))
    ax2 = plt.subplot2grid((2, 2), (0, 1))
    ax3 = plt.subplot2grid((2, 2), (1, 0), colspan=2)  # This subplot spans two columns
    ax1.imshow(image, cmap="gray")
    ax1.set_title("Grayscale image", fontsize=12)
    ax1.axis("off")
    ax2.imshow(edges, cmap="gray")
    ax2.set_title(rf"Canny edge. $\sigma$ = {canny_sigma}", fontsize=12)
    ax2.axis("off")
    ax3.set_title(
        "Each bright pixel in the edge image casts votes in the Hough accumulator array",
        fontsize=10,
    )
    im = ax3.imshow(
        # Multiply by 5 (with clipping) to brighten faint votes for display;
        # multiplying the uint8 array directly would wrap around modulo 256.
        np.minimum(hough_matrix.astype(np.int32) * 5, 255),
        cmap="gray",
        aspect="auto",
        extent=(
            np.rad2deg(thetas[0]),  # rad to deg
            np.rad2deg(thetas[-1]),  # rad to deg
            rhos[0],
            rhos[-1],
        ),
    )
    ax3.set_xlabel(r"$\theta \in [-90, 90)$")
    ax3.set_ylabel("radius")
    cbar = fig.colorbar(
        im, ax=ax3, orientation="vertical"
    )  # Create a colorbar on the right side of the image
    cbar.set_label("Numbers of votes", rotation=270, labelpad=20)

    fig.tight_layout()
    plt.show()

Discussion:

We can observe that every bright pixel in the edge image contributes a vote to the Hough accumulator array. When edges align along a line in the image, these votes accumulate in Hough space, forming a distinct cluster of bright points.

In the first image, there are six evident straight lines, yet nine points appear in Hough space. This is because horizontal lines correspond to $\theta = 90 \text{ or } -90$ degrees, and $\theta = 90$ wraps around to $\theta = -90$ (with $\rho$ negated). This can pose an issue, as the votes for one line are split between two locations. Careful consideration is needed when choosing the number of peaks to avoid missing any lines.
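This split is easy to check numerically. The sketch below (with made-up coordinates) evaluates the voting equation for a pixel on a horizontal line at two angles just inside the two ends of the $\theta$ range:

```python
import numpy as np

# An edge pixel on the horizontal line y = 50. Near the boundary of the
# theta range, the same physical line collects votes in two far-apart bins:
# (rho ~ +50, theta ~ +90 deg) and (rho ~ -50, theta ~ -90 deg).
x, y = 120.0, 50.0
rho_pos = x * np.cos(np.deg2rad(89.9)) + y * np.sin(np.deg2rad(89.9))
rho_neg = x * np.cos(np.deg2rad(-89.9)) + y * np.sin(np.deg2rad(-89.9))
print(round(rho_pos, 1), round(rho_neg, 1))  # 50.2 -49.8
```

Both bins describe the same line, so a peak finder counting distinct maxima will see it twice.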

In the second image, four distinct points appear, grouped into two pairs in Hough space. This occurs because each pen contributes two closely spaced parallel edges.

An important implementation detail in hough_lines_acc is that I normalized the Hough accumulator array before returning it. This is crucial because real-world scenarios often involve both short and long lines. Normalizing the accumulator array ensures that the implementation can perform well in unforeseen situations with a mix of short and long lines.

However, it's essential to acknowledge the potential downside of normalization. In situations where no lines are genuinely present in the scene, this implementation may produce misleading results. Therefore, it's important to use this normalization approach judiciously, considering the specific characteristics of the problem and the expected presence of lines in the image.
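For reference, image_processing_utils.rescale_image is a project utility whose internals are not shown here; a minimal min-max rescaling along the lines of what the accumulator needs might look like this (the name and signature below are my assumption, not the project's actual helper):

```python
import numpy as np

def rescale_to_range(arr, out_range=(0, 255)):
    """Hypothetical stand-in for the project's rescale_image utility:
    min-max normalization of `arr` into `out_range`."""
    arr = arr.astype(np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Flat accumulator (e.g. no edges at all): avoid division by zero.
        return np.full(arr.shape, out_range[0], dtype=np.float64)
    scaled = (arr - lo) / (hi - lo)
    return out_range[0] + scaled * (out_range[1] - out_range[0])

votes = np.array([[0, 10], [5, 20]])
print(rescale_to_range(votes))  # maps 0 -> 0, 5 -> 63.75, 10 -> 127.5, 20 -> 255
```

The degenerate-case branch illustrates the downside discussed above: when the accumulator is nearly flat, min-max scaling stretches noise up to the full range.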

Finding Peaks and Filtering Lines¶

Next, we can extract each line's parameters, specifically its $\theta$ and radius values, from the Hough accumulator by detecting peaks. To avoid selecting duplicate lines, I reset the pixels neighboring each peak to zero. This implementation is relatively simplistic, though: in real-world scenarios, a more robust approach might estimate a peak from the average of a small cluster of pixels rather than from a single cell.

Additionally, since we already know that the object we are searching for is a pen with two distinct edges, we can apply specific constraints to filter the detected lines. For instance, we can stipulate that the angle difference between two lines must be less than one degree, or that the distance between two points in Hough space must fall within a certain threshold. By incorporating these constraints, we can enhance the robustness of our pen detector. However, it's important to emphasize that these constraints are only effective when we have a rough idea of what the objects might look like in terms of their shapes and sizes. Without prior information about the objects of interest, applying such constraints becomes challenging.

In [ ]:
def calculate_distance(t1, t2, rho_threshold, theta_threshold):
    a1, b1 = t1
    a2, b2 = t2
    if abs(b1 - b2) > theta_threshold:
        return None
    if abs(a1 - a2) > rho_threshold:
        return None
    return ((a2 - a1) ** 2 + (b2 - b1) ** 2) ** 0.5


def filter_lines(peaks, num_pairs, rho_threshold, theta_threshold):
    pair_distances = {}
    for p1, p2 in itertools.combinations(peaks, 2):
        distance = calculate_distance(p1, p2, rho_threshold, theta_threshold)
        if distance is not None:
            pair_distances[(p1, p2)] = distance
    # Keep the `num_pairs` closest pairs (fewer if not enough candidates).
    sorted_pairs = sorted(pair_distances.items(), key=lambda item: item[1])

    filtered_peaks = []
    for pair, _ in sorted_pairs[:num_pairs]:
        filtered_peaks.extend(pair)
    return filtered_peaks


def find_hough_peaks(hough_matrix, num_peaks=9, threshold=20, neighborhood_size=20):
    accumulator = hough_matrix.copy()
    peaks = []
    for _ in range(num_peaks):
        row_off, col_off = np.unravel_index(np.argmax(accumulator), accumulator.shape)
        max_value = accumulator[row_off, col_off]
        if max_value < threshold:
            break
        peaks.append((row_off, col_off))

        # Suppress nearby peaks by zeroing the neighborhood around the maximum.
        row_lo = max(row_off - neighborhood_size, 0)
        row_hi = min(row_off + neighborhood_size + 1, accumulator.shape[0])
        col_lo = max(col_off - neighborhood_size, 0)
        col_hi = min(col_off + neighborhood_size + 1, accumulator.shape[1])
        accumulator[row_lo:row_hi, col_lo:col_hi] = 0

    return peaks
In [ ]:
gray_peaks = find_hough_peaks(gray_hough_acc, num_peaks=20, threshold=40, neighborhood_size=20)
lines_gray = hough_transform_utils.draw_lines(
    gray_img,
    gray_peaks,
    gray_thetas,
    np.linspace(gray_rhos[0], gray_rhos[-1], gray_hough_acc.shape[0]),
)

easy_peaks = find_hough_peaks(easy_hough_acc, num_peaks=20, threshold=40, neighborhood_size=20)
lines_easy = hough_transform_utils.draw_lines(
    easy_img,
    easy_peaks,
    easy_thetas,
    np.linspace(easy_rhos[0], easy_rhos[-1], easy_hough_acc.shape[0]),
)

filter_peaks = filter_lines(easy_peaks, num_pairs=3, rho_threshold=100, theta_threshold=2)
lines_filter = hough_transform_utils.draw_lines(
    easy_img,
    filter_peaks,
    easy_thetas,
    np.linspace(easy_rhos[0], easy_rhos[-1], easy_hough_acc.shape[0]),
)

image_data = [
    (lines_gray, "Detected lines."),
    (lines_easy, "Too many detected lines."),
    (lines_filter, "After applying distance constraint."),
]

fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(9, 3))

for i, (lines_img, subtitle, *_) in enumerate(image_data):
    ax[i].set_title(subtitle)
    ax[i].imshow(lines_img, cmap="gray")
    ax[i].axis('off')

fig.suptitle("Same Hough threshold (40) and suppression neighborhood (20) for all.", fontsize=12)

fig.tight_layout()
plt.show()

Discussion:

As you can see, our line detector performs effectively on the first artificial image, which comprises only rectangles, and on the scene featuring two types of objects: pens and coins. Our filtering strategy successfully eliminates the unwanted lines in Hough space.

Hough Transform on Noisy Images¶

In computer vision, finding optimal parameters for reliable performance in real-world scenarios can be extremely challenging. Images often contain noise and variations in brightness. Hence, it's essential to assess how our Hough transform performs on images with noise.

Furthermore, we can consolidate the previous functions into a single function named detect_lines to simplify the creation of a line detection pipeline.

In [ ]:
def detect_lines(
    img, canny_sigma, num_peaks, threshold, neighborhood_size, filtering_lines=False
):
    assert img.ndim == 2
    assert img.dtype == np.float64

    optimal_thresh = filters.threshold_otsu(img)
    low_thresh = 0.5 * optimal_thresh
    edges = feature.canny(
        img, sigma=canny_sigma, low_threshold=low_thresh, high_threshold=optimal_thresh
    )
    hough_acc, thetas, rhos = hough_lines_acc(edges)
    peaks = find_hough_peaks(
        hough_acc,
        num_peaks=num_peaks,
        threshold=threshold,
        neighborhood_size=neighborhood_size,
    )

    if filtering_lines:
        peaks = filter_lines(peaks, num_pairs=3, rho_threshold=50, theta_threshold=2)

    lines_img = hough_transform_utils.draw_lines(
        img,
        peaks,
        thetas,
        np.linspace(rhos[0], rhos[-1], hough_acc.shape[0]),
    )
    return edges, lines_img
In [ ]:
noise_img = image_processing_utils.read_image(gray_noise_img_path)
# Adjust gaussian sigma to control the amount of smoothing.
smooth_img = filters.gaussian(noise_img, sigma=1)

fig, ax = plt.subplots(nrows=2, ncols=5, figsize=(9, 4.5))
image_data = [(noise_img, "Noisy image"), (smooth_img, r"Gauss. Smoothed, $\sigma=1$")]

for i, (img, title) in enumerate(image_data):
    ax[i, 0].imshow(img, cmap="gray")
    ax[i, 0].set_title(title, fontsize=12)
    ax[i, 0].axis("off")
    edges, lines_img = detect_lines(
        img, num_peaks=9, canny_sigma=1.5, threshold=100, neighborhood_size=20
    )

    ax[i, 1].imshow(edges, cmap="gray")
    ax[i, 1].set_title(r"$\sigma=1.5$", fontsize=12)
    ax[i, 1].axis("off")
    ax[i, 2].imshow(lines_img)
    ax[i, 2].set_title(r"Detected lines", fontsize=12)
    ax[i, 2].axis("off")
    edges, lines_img = detect_lines(
        img, num_peaks=9, canny_sigma=2, threshold=100, neighborhood_size=20
    )

    ax[i, 3].imshow(edges, cmap="gray")
    ax[i, 3].set_title(r"$\sigma=2$", fontsize=12)
    ax[i, 3].axis("off")
    ax[i, 4].imshow(lines_img)
    ax[i, 4].set_title(r"Detected lines", fontsize=12)
    ax[i, 4].axis("off")
fig.tight_layout()
plt.show()

Discussion:

Smoothing a noisy image with a Gaussian kernel can enhance the performance of the Canny edge detector. As observed, when we apply the same Canny edge detector with $\sigma=1.5$ to both the noisy image and the smoothed one, the noisy image tends to generate an excessive number of noisy edges. Conversely, the smoothed image exhibits fewer edges while still preserving the essential ones along the boundaries.

On the other hand, increasing $\sigma$ of the Canny edge detector can reduce the number of noisy edges. However, this may result in the loss of important edges, as evident in the bottom-left corner of the top-right image.

In this context, the most favorable result appears to be the bottom-left image, where there are relatively few noisy edges in the middle, yet all the critical edges along the boundaries are effectively retained by the Canny edge detector.

Hough Line Transform on "Realistic Images"¶

Finally, we can apply our line detection pipeline to "realistic images" featuring books, pens, and coins. These images are categorized into three levels of difficulty: "Easy," "Middle," and "Hard," each described as follows:

  • Easy: In this scenario, two pens and a few coins are positioned on a black blanket, which aids the Canny edge detector in preserving useful edges.

  • Middle: This scenario involves two pens and several coins placed on a book with a cluttered pattern on its cover. This setting generates more noisy edges and poses a challenge to our Hough transform.

  • Hard: Similar to the "Middle" setup, but with distorted viewpoints, making line detection more challenging.

In [ ]:
easy_img = image_processing_utils.read_image(easy_img_path)
smooth_easy = filters.gaussian(easy_img, sigma=2)
middle_img = image_processing_utils.read_image(middle_img_path)
smooth_middle = filters.gaussian(middle_img, sigma=2)
hard_img = image_processing_utils.read_image(hard_img_path)
smooth_hard = filters.gaussian(hard_img, sigma=2)

fig, ax = plt.subplots(nrows=6, ncols=6, figsize=(20, 15))
image_data = [(easy_img, "Easy image"), (smooth_easy, r"Gauss Easy, $\sigma=2$"),
              (middle_img, "Middle image"), (smooth_middle, r"Gauss Middle, $\sigma=2$"),
              (hard_img, "Hard image"), (smooth_hard, r"Gauss Hard, $\sigma=2$")]

for i, (img, title) in enumerate(image_data):
    ax[i, 0].imshow(img, cmap="gray")
    ax[i, 0].text(-60, 250, title, fontsize=12, rotation='vertical', va='center')
    ax[i, 0].axis("off")
    edges, lines_img = detect_lines(
        img, num_peaks=20, canny_sigma=1.5, threshold=40, neighborhood_size=20
    )
    ax[i, 1].imshow(edges, cmap="gray")
    ax[0, 1].set_title(r"$\sigma=1.5$", fontsize=12)
    ax[i, 1].axis("off")

    ax[i, 2].imshow(lines_img)
    ax[0, 2].set_title(r"Detected lines", fontsize=12)
    ax[i, 2].axis("off")
    edges, lines_img = detect_lines(
        img, num_peaks=20, canny_sigma=3, threshold=40, neighborhood_size=20
    )
    ax[i, 3].imshow(edges, cmap="gray")
    ax[0, 3].set_title(r"$\sigma=3$", fontsize=12)
    ax[i, 3].axis("off")

    ax[i, 4].imshow(lines_img)
    ax[0, 4].set_title(r"Detected lines", fontsize=12)
    ax[i, 4].axis("off")

    edges, lines_img = detect_lines(
        img, num_peaks=20, canny_sigma=0.75, threshold=40, neighborhood_size=20, filtering_lines=True
    )
    ax[i, 5].imshow(lines_img)
    ax[0, 5].set_title(r"Filtering lines, $\sigma=0.75$", fontsize=12)
    ax[i, 5].axis("off")

fig.tight_layout()
plt.show()

Discussion:

The effectiveness of each parameter, and the impact of applying constraints during peak finding, are discussed below:

  • Gaussian Smoothing: Smoothing images with a Gaussian kernel does have an effect on the performance, particularly in conjunction with the Canny edge detector. It can reduce fine-grained details, similar to increasing the sigma value of the Canny edge detector.

  • Sigma Value $\sigma$ of Canny Edge Detector: Choosing the appropriate $\sigma$ for the Canny edge detector is crucial. It involves finding a balance between preserving important details and suppressing excessive noise. This optimal sigma value can vary across different scenes, and its selection may be guided by heuristic values within a defined interval. Proper preprocessing methods can aid in optimizing this parameter for various scenarios.

  • Number of Peaks vs. Hough Threshold: Balancing the number of peaks and the Hough threshold is a nuanced task. Increasing the number of peaks or lowering the threshold can lead to the detection of more lines. However, deciding which one to tune is a manual process and depends on the specific requirements of the application.

  • Applying Constraints When Finding Peaks: The application of constraints plays a critical role in refining the detection process. Constraints are valuable when you have prior knowledge about the appearance of objects in the scene. However, it's important to recognize that in many real-world scenarios, it may not be feasible to apply constraints, especially when dealing with diverse types of objects and complex scenes, as seen in applications like self-driving cars.

Circle Hough Transform¶

The circle Hough Transform is a basic feature extraction technique used in digital image processing for detecting circles in imperfect images. In a two-dimensional space, a circle can be described by:

$$ (x-a)^2 + (y-b)^2 = r^2 $$

where $(a,b)$ is the center of the circle and $r$ is the radius. If a 2D point $(x,y)$ and the radius $r$ are fixed, the parameter space is two-dimensional, defined by $(a, b)$, which specifies the circle's position. But if only the 2D point $(x,y)$ is fixed and the radius $r$ varies, the parameter space becomes three-dimensional, $(a, b, r)$, and all parameter triples consistent with $(x, y)$ lie on the surface of an inverted right circular cone whose apex is at $(x, y, 0)$. The image below illustrates this concept:
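This geometry is easy to verify numerically: with the radius fixed, every center an edge point votes for lies exactly $r$ away from that point. A small sketch with made-up coordinates:

```python
import numpy as np

# For a fixed radius r, an edge point (x, y) votes for centers
# a = x - r*cos(t), b = y - r*sin(t), which trace a circle of radius r
# around the edge point itself.
x, y, r = 30.0, 40.0, 10.0
t = np.deg2rad(np.arange(0, 360))
a = x - r * np.cos(t)
b = y - r * np.sin(t)
distances = np.sqrt((a - x) ** 2 + (b - y) ** 2)
print(np.allclose(distances, r))  # True
```

Where many such circles of voted centers intersect, the accumulator accumulates a peak at the true circle center.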


Hough Accumulator Array for Circles¶

Circle candidates are generated by a process of "voting" in the Hough parameter space, followed by the selection of local maxima in an accumulator matrix. To provide a more detailed explanation, the pseudocode below outlines the steps involved in creating the Hough accumulator for circles with a fixed radius $r$:

hough_circles_acc(edge_image, r):
    For each bright point (x, y):
        For each theta t = 0 to 360:  // sweep the possible angles
            b = y - r * sin(t);       // polar coordinate for center
            a = x - r * cos(t);       // polar coordinate for center
            A[a, b] += 1;             // voting
        end
    end

which is implemented as follows. It's worth noting that, unlike hough_lines_acc, I do not rescale hough_acc before returning it. The rationale: when searching over a range of radii, some radius values may have no corresponding circle in the image. In such cases, rescaling hough_acc would generate spurious circles wherever the largest (but still small) vote count happens to lie.

In [ ]:
def hough_circles_acc(binary_image, radius):
    num_rows, num_cols = binary_image.shape
    # Use a wide integer type: vote counts are unbounded in principle.
    hough_acc = np.zeros((num_rows, num_cols), dtype=np.uint64)

    edge_pixels = np.argwhere(binary_image > 0)
    thetas = np.deg2rad(np.arange(0, 360))

    for y, x in edge_pixels:
        # Convert pixel indices to the image coordinate system.
        y_coord = y + 1
        x_coord = x + 1
        for theta in thetas:
            b = int(y_coord - radius * np.sin(theta))
            a = int(x_coord - radius * np.cos(theta))
            if 0 <= b < num_rows and 0 <= a < num_cols:
                hough_acc[b, a] += 1

    # Deliberately not rescaled: see the explanation above.
    return hough_acc


easy_img = image_processing_utils.read_image(easy_img_path)
edge_easy = feature.canny(easy_img, sigma=3)

hough_acc_20 = hough_circles_acc(edge_easy, radius=20)
hough_acc_27 = hough_circles_acc(edge_easy, radius=27)
In [ ]:
# Create a Matplotlib figure and a 2x2 grid of subplots
fig = plt.figure(figsize=(10, 7))
fig.suptitle("Each bright pixel in the edge image casts votes in the Hough accumulator array", fontsize=12)

ax1 = plt.subplot2grid((2, 2), (0, 0))
ax2 = plt.subplot2grid((2, 2), (1, 0))
ax3 = plt.subplot2grid((2, 2), (0, 1))
ax4 = plt.subplot2grid((2, 2), (1, 1))

ax1.imshow(easy_img, cmap="gray")
ax1.set_title("Grayscale image", fontsize=12)
# ax1.axis("off")
ax2.imshow(edge_easy, cmap="gray")
ax2.set_title(r"Canny edge image. $\sigma$ = 3", fontsize=12)
# ax2.axis("off")

ax3.set_title(
    "Radius: 20 pixels",
    fontsize=10,
)
im = ax3.imshow(
    hough_acc_20,
    cmap="gray",
    aspect="auto",
    vmin=hough_acc_20.min(),
    vmax=hough_acc_20.max(),
)
ax3.set_xlabel("x")
ax3.set_ylabel("y")
cbar = fig.colorbar(im, ax=ax3, orientation="vertical")
cbar.set_label("Numbers of votes", rotation=270, labelpad=20)

ax4.set_title(
    "Radius: 27 pixels",
    fontsize=10,
)
im = ax4.imshow(
    hough_acc_27,
    cmap="gray",
    aspect="auto",
    vmin=hough_acc_27.min(),
    vmax=hough_acc_27.max(),
)
ax4.set_xlabel("x")
ax4.set_ylabel("y")
cbar = fig.colorbar(im, ax=ax4, orientation="vertical")
cbar.set_label("Numbers of votes", rotation=270, labelpad=20)

fig.tight_layout()
plt.show()

Discussion:

I first applied the Canny edge detector with a sigma value of 3 to the "Easy" image to obtain the edge image. Then, I processed the edge image through the hough_circles_acc function with two different radii, 20 and 27. Two important details are discussed below:

  • Hough Accumulator Array: When a fixed radius value is given, the Hough accumulator array is two-dimensional, and its shape matches that of the input edge image. This differs from the case of the Hough Line Transform.

  • Radius: It's evident that the choice of radius plays a significant role in circle detection. When the radius is set to 20 pixels, the algorithm primarily detects smaller coins with that radius. In contrast, when the radius is increased to 27 pixels, it becomes capable of detecting larger coins. The choice of radius is crucial in determining the size of circles that can be effectively detected in the image.

Finding Peaks and Drawing Circles¶

We can now apply the same peak-finding function, find_hough_peaks, but with different threshold values and neighborhood sizes to detect circles in Hough space.

In [ ]:
peaks = find_hough_peaks(hough_acc_20, num_peaks=5, threshold=130, neighborhood_size=50)
circles_20_img = hough_transform_utils.draw_circles(easy_img.copy(), peaks, 20)

peaks = find_hough_peaks(hough_acc_27, num_peaks=5, threshold=130, neighborhood_size=50)
circles_27_img = hough_transform_utils.draw_circles(easy_img.copy(), peaks, 27)

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(9, 4))
fig.suptitle("Detected circles with different radii.", fontsize=12)

ax[0].imshow(circles_20_img, cmap="gray")
ax[0].axis('off')
ax[0].set_title("Radius: 20 pixels")
ax[1].imshow(circles_27_img, cmap="gray")
ax[1].axis('off')
ax[1].set_title("Radius: 27 pixels")

fig.tight_layout()
plt.show()

Circle Hough Transform on "Realistic Images"¶

Finally, we can apply our circle detection pipeline to "realistic images," following the same approach as before. To do this, we consolidate the previous functions into a single circle detection function named detect_circles, then apply it to the images at each difficulty level.

In [ ]:
def detect_circles(img, radii, canny_sigma, threshold):
    edge_img = feature.canny(img, sigma=canny_sigma)
    
    circles = []
    for radius in radii:
        hough_acc = hough_circles_acc(edge_img, radius=radius)
        peaks = find_hough_peaks(hough_acc, num_peaks=10, threshold=threshold, neighborhood_size=40)
        if len(peaks) != 0:
            circles.append((radius, peaks))

    circles_img = img.copy()
    for radius, peaks in circles:
        circles_img = hough_transform_utils.draw_circles(circles_img, peaks, radius)
    
    return edge_img, circles_img
In [ ]:
easy_img = image_processing_utils.read_image(easy_img_path)
smooth_easy = filters.gaussian(easy_img, sigma=2)
middle_img = image_processing_utils.read_image(middle_img_path)
smooth_middle = filters.gaussian(middle_img, sigma=2)
hard_img = image_processing_utils.read_image(hard_img_path)
smooth_hard = filters.gaussian(hard_img, sigma=2)

easy_radii = np.arange(19, 30, 2)  # [19 21 23 25 27 29]
middle_radii = np.arange(20, 37, 2)  # [20 22 24 26 28 30 32 34 36]
hard_radii = np.arange(19, 35, 2)  # [19 21 23 25 27 29 31 33]

image_data = [
    (easy_img, easy_radii, 130, "Easy image"),
    (smooth_easy, easy_radii, 130, r"Gauss Easy, $\sigma=2$"),
    (middle_img, middle_radii, 110, "Middle image"),
    (smooth_middle, middle_radii, 90, r"Gauss Middle, $\sigma=2$"),
    (hard_img, hard_radii, 90, "Hard image"),
    (smooth_hard, hard_radii, 70, r"Gauss Hard, $\sigma=2$"),
]

fig, ax = plt.subplots(nrows=6, ncols=5, figsize=(18, 15))

for i, (img, radii, hough_thres, title) in enumerate(image_data):
    ax[i, 0].imshow(img, cmap="gray")
    ax[i, 0].text(-60, 250, title, fontsize=12, rotation="vertical", va="center")
    ax[i, 0].axis("off")

    edges, circles_img = detect_circles(
        img, radii, canny_sigma=2, threshold=hough_thres
    )
    ax[i, 1].imshow(edges, cmap="gray")
    ax[0, 1].set_title(r"$\sigma=2$", fontsize=12)
    ax[i, 1].axis("off")
    ax[i, 2].imshow(circles_img)
    ax[0, 2].set_title(r"Detected circles", fontsize=12)
    ax[i, 2].axis("off")

    edges, circles_img = detect_circles(
        img, radii, canny_sigma=3, threshold=hough_thres
    )
    ax[i, 3].imshow(edges, cmap="gray")
    ax[0, 3].set_title(r"$\sigma=3$", fontsize=12)
    ax[i, 3].axis("off")
    ax[i, 4].imshow(circles_img)
    ax[0, 4].set_title(r"Detected circles", fontsize=12)
    ax[i, 4].axis("off")

fig.tight_layout()
plt.show()

Discussion:

The process of detecting circles with the Hough transform shares several similarities with detecting lines. Many of the discussions and concepts explored for the line Hough transform apply to circle detection as well.

One noteworthy addition to circle detection is the potential for optimization by considering a cluster of pixel values when finding peaks in Hough space. This approach can enhance the robustness of circle detection, especially in scenarios where circles may not align precisely with the grid of accumulator cells.
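One way to sketch that refinement (a hypothetical helper, not part of the notebook's pipeline; the window half-size is an assumption):

```python
import numpy as np

def refine_peak(acc, row, col, half=2):
    """Refine an integer peak location to sub-cell precision by taking the
    vote-weighted centroid of a small window around the argmax."""
    r0, r1 = max(row - half, 0), min(row + half + 1, acc.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, acc.shape[1])
    window = acc[r0:r1, c0:c1].astype(np.float64)
    total = window.sum()
    if total == 0:
        return float(row), float(col)
    rows, cols = np.mgrid[r0:r1, c0:c1]
    return (rows * window).sum() / total, (cols * window).sum() / total

# A "true" center halfway between cells: votes land on a 2x2 block, and the
# argmax alone (4, 4) misses the sub-cell position.
acc = np.zeros((10, 10))
acc[4:6, 4:6] = 1
print(refine_peak(acc, 4, 4))  # (4.5, 4.5)
```

The same idea plugs into find_hough_peaks by calling the refinement on each detected (row, col) before converting it back to circle parameters.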

Conclusion¶

In conclusion, our exploration of the Hough Transform in computer vision has revealed its effectiveness in detecting both lines and circles. We've discussed key concepts and implementation details for line detection, emphasizing the importance of accumulator arrays and peak finding.

Throughout our discussions, we've emphasized the importance of parameter tuning, normalization, and understanding the characteristics of the problem at hand. These insights are crucial for achieving accurate and robust results in computer vision applications.

In [ ]:
# Code to generate the profile image `detected_lines_circles.png`.

# middle_img = image_processing_utils.read_image(middle_img_path)

# edges, lines_img = detect_lines(
#     middle_img, num_peaks=20, canny_sigma=2, threshold=40, neighborhood_size=20, filtering_lines=True
# )
# middle_radii = np.arange(20, 37, 2)  # [20 22 24 26 28 30 32 34 36]

# edges, circles_img = detect_circles(
#     middle_img, middle_radii, canny_sigma=2, threshold=110
# )

# middle_img *= 255
# middle_img[middle_img >= 255] = 255
# middle_img[middle_img < 0] = 0
# middle_img = middle_img.astype(np.uint8)

# limg = lines_img - middle_img[:, :, None]
# cimg = circles_img - middle_img[:, :, None]
# both_img = middle_img[:, :, None] + limg + cimg

# both_img[both_img >= 255] = 255
# both_img[both_img < 0] = 0
# both_img = both_img.astype(np.uint8)

# fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8, 8))
# ax.imshow(both_img, cmap="gray")
# ax.set_axis_off()
# plt.savefig('./images/detected_lines_circles.png', bbox_inches='tight', pad_inches=0)
# fig.tight_layout()
# plt.show()